Abstract: In his thesis, Wiberg showed the existence of thresholds
for families of regular low-density parity-check codes under
min-sum algorithm decoding. He also derived analytic bounds on 
these thresholds. In this paper, we formulate similar results for 
linear programming decoding of regular low-density parity-check 
codes. 

I. Introduction 

The goal of this paper is to shed some light on the 
connection between min-sum algorithm (MSA) decoding and 
the formulation of decoding as a linear program. In particular, 
we address the problem of bounding the performance of 
linear programming (LP) decoding with respect to word error 
rate. The bounds reflect similar analytic bounds for MSA 
decoding of low-density parity-check (LDPC) codes due to 
Wiberg [1] and establish the existence of an SNR threshold for 
LP decoding. While highly efficient and structured computer- 
based evaluation techniques, such as density evolution (see 
e.g. [2], [3], [4], [5], [6]), provide excellent bounds on the 
performance of iterative decoding techniques, to the best of 
our knowledge, the best analytic performance bound in the 
case of MSA decoding is still the bound given by Wiberg 
in his thesis based on the weight distribution of a tree-like 
neighborhood of a vertex in a graph. A similar bound was 
also derived by Lentmaier et al. [5]. We derive the equivalent 
bound for LP decoding of regular LDPC codes. 

II. Notation and Basics 

In this paper we are interested in binary LDPC codes, where
a binary LDPC code $C$ of length $N$ is defined as the null-space
of a sparse binary parity-check matrix $H$, i.e.
$C = \{x \in \mathbb{F}_2^N \mid H x^T = 0^T\}$. In particular we focus on the case
of regular codes: an LDPC code $C$ is called $(J, K)$-regular
if each column of $H$ has Hamming weight $J$ and each row
has Hamming weight $K$. The rate of a $(J, K)$-regular code is
lower bounded by $1 - J/K$. To an $M \times N$ parity-check matrix
$H$ we can naturally associate a bipartite graph, the so-called
Tanner graph $T(H)$. This graph contains two classes of nodes:
variable nodes $V_v$ and check nodes $V_c$. Both variable nodes
and check nodes are identified with subsets of the integers.
Variable nodes are denoted as $V_v = \{0, 1, \ldots, N-1\}$ and
check nodes are denoted as $V_c = \{0, 1, \ldots, M-1\}$. Whenever
we want to express that an integer belongs to the set of variable
nodes we write $i \in V_v$; similarly, when an integer belongs
to the set of check nodes we write $j \in V_c$. The Tanner
graph $T(H)$ contains an edge between node $i \in V_v$
and $j \in V_c$ if and only if the entry $h_{ij}$ is non-zero. The set
of neighbors of a node $i \in V_v$ is denoted as $\partial(i)$; similarly,
the set of neighbors of a node $j \in V_c$ is denoted as $\partial(j)$. In
the following,

$$\mathcal{E} \triangleq \{(i,j) \in V_v \times V_c \mid i \in V_v,\ j \in \partial(i)\} = \{(i,j) \in V_v \times V_c \mid j \in V_c,\ i \in \partial(j)\}$$

will be the set of edges in the Tanner graph $T(H)$. The convex
hull of a set $A \subseteq \mathbb{R}^n$ is denoted by $\mathrm{conv}(A)$. If $A$ is a
subset of $\mathbb{F}_2^n$ then $\mathrm{conv}(A)$ denotes the convex hull of the
set $A$ after $A$ has been canonically embedded in $\mathbb{R}^n$. The inner
product between vectors $x$ and $y$ is denoted as
$\langle x, y \rangle = \sum_i x_i y_i$. Finally, we define the set of all
binary vectors of length $K$ and even weight as

$$\mathcal{B}^{(K)} \triangleq \Bigl\{ b \in \{0,1\}^K \Bigm| \sum_{i} b_i = 0 \pmod 2 \Bigr\}.$$

In the rest of this paper we assume that the all-zeros word
was transmitted, an assumption without any essential loss
of generality because we only consider binary linear codes
that are used for data transmission over a binary-input
output-symmetric channel. Given a received vector $y$ we define the
vector $\lambda = (\lambda_0, \lambda_1, \ldots, \lambda_{N-1})$ of log-likelihood ratios by

$$\lambda_i \triangleq \log\left(\frac{P_{Y|X}(y_i \mid 0)}{P_{Y|X}(y_i \mid 1)}\right).$$
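As an illustration of the notation of this section, the following Python sketch builds the neighborhoods $\partial(i)$ and $\partial(j)$ from a small parity-check matrix and computes log-likelihood ratios. The matrix `H`, the helper names and the choice of a binary symmetric channel with crossover probability `p` are assumptions made for this example only; they do not appear in the text above.

```python
from math import log

def tanner_neighbors(H):
    """Return (var_nbrs, chk_nbrs): the neighborhoods d(i) and d(j) as sorted lists."""
    M, N = len(H), len(H[0])
    var_nbrs = [[j for j in range(M) if H[j][i]] for i in range(N)]
    chk_nbrs = [[i for i in range(N) if H[j][i]] for j in range(M)]
    return var_nbrs, chk_nbrs

def is_regular(H, J, K):
    """True if every column of H has weight J and every row has weight K."""
    var_nbrs, chk_nbrs = tanner_neighbors(H)
    return (all(len(n) == J for n in var_nbrs)
            and all(len(n) == K for n in chk_nbrs))

def bsc_llrs(y, p):
    """lambda_i = log(P(y_i|0) / P(y_i|1)) for a BSC with crossover probability p."""
    a = log((1.0 - p) / p)
    return [a if bit == 0 else -a for bit in y]

# Hypothetical toy (2, 4)-regular matrix with M = 3 checks and N = 6 variables.
H = [[1, 1, 1, 1, 0, 0],
     [1, 1, 0, 0, 1, 1],
     [0, 0, 1, 1, 1, 1]]
var_nbrs, chk_nbrs = tanner_neighbors(H)
lam = bsc_llrs([0, 0, 1, 0, 0, 0], p=0.1)
print(is_regular(H, 2, 4), var_nbrs[0], chk_nbrs[2], lam)
```

With the all-zeros word transmitted, a positive $\lambda_i$ corresponds to a correctly received position and a negative $\lambda_i$ to a flipped one.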

III. LP Decoding 

Maximum likelihood (ML) decoding may be cast as a linear
program once we have translated the problem into $\mathbb{R}^N$. To
this end we embed the code into $\mathbb{R}^N$ by straightforward
identification of $\mathbb{F}_2 = \{0, 1\}$ with $\{0, 1\} \subset \mathbb{R}$. In other words,
a code $C$ is identified with a subset of $\{0, 1\}^N \subset \mathbb{R}^N$.



Maximum Likelihood Decoding

Minimize: $\langle \lambda, x \rangle$
Subject to: $x \in \mathrm{conv}(C)$



This description is usually not practical since the polytope 
conv(C) is typically very hard to describe by hyperplanes 
(or as a convex combination of points). Given a parity-check 
matrix H, the linear program is relaxed to [7], [8] 



LP Decoding

Minimize: $\langle \lambda, x \rangle$
Subject to: $x \in \mathcal{P}(H)$



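Here, $\mathcal{P}(H)$ is the so-called fundamental polytope [7], [8],
[9], [10], which is defined as

$$\mathcal{P}(H) \triangleq \bigcap_{j=0}^{M-1} \mathrm{conv}(C_j),$$

where

$$C_j = C_j(H) \triangleq \{x \in \mathbb{F}_2^N \mid h_j x^T = 0 \pmod 2\},$$

and $h_j$ is the $j$-th row of $H$.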
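The definition of $\mathcal{P}(H)$ can be made concrete with a small membership test. The sketch below relies on the known description of the convex hull of the even-weight words of a single check by odd-subset inequalities, $\sum_{i \in S} x_i - \sum_{i \in \partial(j) \setminus S} x_i \le |S| - 1$ for all odd-sized $S \subseteq \partial(j)$, together with $0 \le x_i \le 1$. This description is not spelled out in the text above, and the function names and the toy matrix are assumptions of this example.

```python
from itertools import combinations

def in_local_polytope(x, nbrs, tol=1e-9):
    """Check the odd-subset inequalities of one check node with neighborhood nbrs."""
    total = sum(x[i] for i in nbrs)
    for size in range(1, len(nbrs) + 1, 2):          # odd-sized subsets S
        for S in combinations(nbrs, size):
            in_S = sum(x[i] for i in S)
            # sum_{i in S} x_i - sum_{i in nbrs \ S} x_i <= |S| - 1
            if in_S - (total - in_S) > size - 1 + tol:
                return False
    return True

def in_fundamental_polytope(x, H, tol=1e-9):
    """Check 0 <= x_i <= 1 and the local constraints of every row of H."""
    if any(v < -tol or v > 1 + tol for v in x):
        return False
    for row in H:
        nbrs = [i for i, h in enumerate(row) if h]
        if not in_local_polytope(x, nbrs, tol):
            return False
    return True

# Hypothetical toy (2, 4)-regular parity-check matrix.
H = [[1, 1, 1, 1, 0, 0],
     [1, 1, 0, 0, 1, 1],
     [0, 0, 1, 1, 1, 1]]
print(in_fundamental_polytope([0, 0, 0, 0, 0, 0], H))   # all-zeros codeword
print(in_fundamental_polytope([1, 1, 0, 0, 0, 0], H))   # a codeword
print(in_fundamental_polytope([1, 0, 0, 0, 0, 0], H))   # not a codeword
print(in_fundamental_polytope([0.5] * 6, H))            # a fractional point
```

The number of inequalities per check grows exponentially in $K$, which is harmless here because $K$ is fixed for a $(J, K)$-regular family.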

Since $\mathbf{0}$ is always a feasible point, i.e. $\mathbf{0} \in \mathcal{P}(H)$ holds, zero
is an upper bound on the value of the LP in LP decoding.
In fact, we can turn this statement around by saying that 
whenever the value of the linear program equals zero then 
the all-zeros codeword will be among the solutions to the LP. 
Thus, motivated by the assumption that the all-zeros codeword 
was transmitted, we focus our attention on showing that, under 
suitable conditions, the value of the LP is zero which implies 
that the all-zeros codeword will be found as a solution. For 
simplicity we only consider channels where the channel output 
is a continuous random variable. In this case a zero value of 
the LP implies that the zero word is the unique solution with 
probability one. The main idea now is to show that the value
of the dual linear program is zero. This technique, dubbed
"dual witness" by Feldman et al. in [11], will then imply
correct decoding.

First, however, we need to establish the dual linear program.
To this end, for each $(i,j) \in \mathcal{E}$, we associate the variable $\tau_{ij}$
with the edge between variable node $i$ and check node $j$ in the
Tanner graph $T(H)$. In other words, we have a variable $\tau_{ij}$
if and only if the entry $h_{ij}$ is non-zero. For each $j \in V_c$ we
define the vector $\tau_j$ that collects all the variables $\{\tau_{ij}\}_{i \in \partial(j)}$.
Also, for each $j \in V_c$, we associate the variable $\theta_j$ with the
check node $j$. We have^1



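Dual LP

Maximize: $\sum_{j=0}^{M-1} \theta_j$
Subject to: $\theta_j \le \langle x, \tau_j \rangle$ for all $j \in V_c$ and all $x \in \mathcal{B}^{(K)}$
            $\sum_{j \in \partial(i)} \tau_{ij} = \lambda_i$ for all $i \in V_v$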
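To make the dual program tangible, the following Python sketch evaluates the dual objective for a given assignment $\tau$. It first checks the equality constraint at every variable node and then sets each $\theta_j$ to its largest admissible value, the minimum of $\langle x, \tau_j \rangle$ over all even-weight $x$. The function name, the dictionary representation of $\tau$ and the toy graph are assumptions of this example.

```python
from itertools import product

def dual_value(tau, lam, var_nbrs, chk_nbrs, tol=1e-9):
    """Return sum_j theta_j for the best theta given tau, or None if tau is infeasible.

    tau is a dict keyed by edges (i, j); lam is the list of log-likelihood ratios.
    """
    # Equality constraint: sum_{j in d(i)} tau_ij = lambda_i for every variable node.
    for i, nbrs in enumerate(var_nbrs):
        if abs(sum(tau[(i, j)] for j in nbrs) - lam[i]) > tol:
            return None
    value = 0.0
    for j, nbrs in enumerate(chk_nbrs):
        # theta_j = min over even-weight x of <x, tau_j>; x = 0 always gives 0.
        theta_j = min(
            sum(b_i * tau[(i, j)] for b_i, i in zip(b, nbrs))
            for b in product((0, 1), repeat=len(nbrs))
            if sum(b) % 2 == 0
        )
        value += theta_j
    return value

# Hypothetical toy (2, 4)-regular Tanner graph with 6 variable and 3 check nodes.
var_nbrs = [[0, 1], [0, 1], [0, 2], [0, 2], [1, 2], [1, 2]]
chk_nbrs = [[0, 1, 2, 3], [0, 1, 4, 5], [2, 3, 4, 5]]
lam = [1.0] * 6
# Split every lambda_i evenly over the J = 2 incident edges.
tau = {(i, j): lam[i] / len(var_nbrs[i]) for i in range(6) for j in var_nbrs[i]}
print(dual_value(tau, lam, var_nbrs, chk_nbrs))
```

A returned value of zero certifies, by weak duality, that the all-zeros word attains the LP optimum for this $\lambda$.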



The dual program has a number of nice interpretations. Any
$\theta_j$ is bounded from above by zero and can only equal zero
if the vector $\tau_j$ has minimal correlation with the all-zeros
codeword.^2 Thus the dual program will only get a zero value
if we find an assignment to $\tau$ such that the local all-zeros
words are among the "best" words for all $j$. We are constrained
in setting the $\tau_{ij}$-values by the second equality constraint.

^1 In the formal dual program the equality constraint
$\sum_{j \in \partial(i)} \tau_{ij} = \lambda_i$ is an inequality ($\le$). However, there
always exists a maximizing assignment of dual variables that
satisfies this condition with equality.

^2 In a generalized LDPC code setting, the local code would have to
be replaced by the corresponding code.

IV. MSA Decoding

While MSA decoding is not the focus of interest in this
paper, it turns out that the MSA lies at the core of the proof
technique that we will use. The MSA is an algorithm that is
run until a predetermined criterion is reached. With each
edge in the graph we associate two messages: one message is
going towards the check node and one is directed towards the
variable node. Let the two messages be denoted by $\mu_{ij}$ and
$\nu_{ij}$, respectively, where, as in the case of the single variable
$\tau_{ij}$ in the section above, variables are only defined if the entry
$h_{ij}$ is non-zero. The update rules of the MSA are then

Min-Sum Algorithm (MSA)

Initialize all variables $\nu_{ij}$ to zero.

1) For all $(i,j) \in \mathcal{E}$, let

   $$\mu_{ij} = \lambda_i + \sum_{j' \in \partial(i) \setminus \{j\}} \nu_{ij'}$$

2) For all $(i,j) \in \mathcal{E}$, let

   $$\nu_{ij} = \Bigl(\prod_{i' \in \partial(j) \setminus \{i\}} \mathrm{sign}(\mu_{i'j})\Bigr) \cdot \min\{|\mu_{i'j}| : i' \in \partial(j) \setminus \{i\}\}$$

Rather than the quantity $\nu_{ij}$ we will consider its negative
value. Moreover, we keep track of the messages that were
sent by message numbers in the superscript. Thus we modify
the MSA update equations as

Modified Min-Sum Algorithm (modified MSA)

Initialize all variables $\nu_{ij}^{(1)}$ to zero.

1) For all $(i,j) \in \mathcal{E}$, let^3

   $$\mu_{ij}^{(\ell)} = \lambda_i - \sum_{j' \in \partial(i) \setminus \{j\}} \nu_{ij'}^{(\ell)}$$

2) For all $(i,j) \in \mathcal{E}$, let

   $$\nu_{ij}^{(\ell+1)} = -\Bigl(\prod_{i' \in \partial(j) \setminus \{i\}} \mathrm{sign}(\mu_{i'j}^{(\ell)})\Bigr) \cdot \min\{|\mu_{i'j}^{(\ell)}| : i' \in \partial(j) \setminus \{i\}\}$$

Clearly, the sign change leaves the algorithmic update steps
essentially unchanged. (Note that e.g. when all $\{\mu_{ij}^{(\ell)}\}_{i \in \partial(j)}$
are non-negative then all $\{\nu_{ij}^{(\ell+1)}\}_{i \in \partial(j)}$ will be non-positive.)

^3 Equivalently, $\mu_{ij}^{(\ell)} + \sum_{j' \in \partial(i) \setminus \{j\}} \nu_{ij'}^{(\ell)} = \lambda_i$, which
more closely reflects the structure of the dual program above.
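The modified MSA can be sketched in a few lines of Python. The code below is only an illustration under stated assumptions: messages are stored in dictionaries keyed by edges $(i, j)$, the sign of a zero message is taken as $+1$, the superscript convention ($\nu^{(1)}$ is the initialization, $\mu^{(\ell)}$ is computed from $\nu^{(\ell)}$, and $\nu^{(\ell+1)}$ from $\mu^{(\ell)}$) follows the reading of the update rules used here, and the toy graph is hypothetical.

```python
def modified_min_sum(lam, var_nbrs, chk_nbrs, iterations, nu_init=0.0):
    """Run the modified MSA and return [(mu^(1), nu^(1)), (mu^(2), nu^(2)), ...].

    Each mu^(l) and nu^(l) is a dict keyed by edges (i, j).
    """
    edges = [(i, j) for i, nbrs in enumerate(var_nbrs) for j in nbrs]
    nu = {e: nu_init for e in edges}                      # nu^(1)
    history = []
    for _ in range(iterations):
        # Step 1: mu_ij = lambda_i - sum over the other checks j' of nu_ij'.
        mu = {(i, j): lam[i] - sum(nu[(i, jp)] for jp in var_nbrs[i] if jp != j)
              for (i, j) in edges}
        history.append((mu, nu))
        # Step 2: nu_ij = -(product of signs) * (minimum magnitude) over the
        # other variables i' of check j.
        new_nu = {}
        for (i, j) in edges:
            others = [mu[(ip, j)] for ip in chk_nbrs[j] if ip != i]
            sign = -1 if sum(m < 0 for m in others) % 2 else 1
            new_nu[(i, j)] = -sign * min(abs(m) for m in others)
        nu = new_nu
    return history

# Hypothetical toy (2, 4)-regular Tanner graph.
var_nbrs = [[0, 1], [0, 1], [0, 2], [0, 2], [1, 2], [1, 2]]
chk_nbrs = [[0, 1, 2, 3], [0, 1, 4, 5], [2, 3, 4, 5]]
lam = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
history = modified_min_sum(lam, var_nbrs, chk_nbrs, iterations=2)
mu2, nu2 = history[1]
print(mu2[(2, 0)], nu2[(2, 2)])
```

By construction every pair $(\mu^{(\ell)}, \nu^{(\ell)})$ satisfies $\mu_{ij}^{(\ell)} + \sum_{j' \ne j} \nu_{ij'}^{(\ell)} = \lambda_i$, which is the property that links the messages to the equality constraint of the dual program.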

We will need the notion of a computation tree (CT) [1]. We
can distinguish two types of CTs, rooted either at a variable
node or at a check node. Our CTs will be rooted at check nodes,
which is more natural when dealing with the dual program.
A CT of depth $L$ consists of all nodes in the universal cover
of the Tanner graph that are reachable in $2L - 1$ steps. In
particular, we will most of the time assume that the leaves in
the CT are variable nodes.

Assume we have run the MSA for $L$ iterations, corresponding
to a CT of depth $L$. For the moment let us also assume
that the underlying graph has girth larger than $4L$. Based on
the iterations of the MSA and a fixed CT root node $j_0 \in V_c$ we
can assign values to the dual variables in the following way.

We assign values to $\tau_{ij}$ according to the distance of the
edge $(i,j)$ to the root node of the CT.^4 So, if $(i,j)$ is at
distance $2\ell + 1$ from the root node $j_0$ then $\tau_{ij}$ is assigned
the value $\mu_{ij}^{(L-\ell)}$, and if $(i,j)$ is at distance $2\ell + 2$ from the
root node $j_0$ then $\tau_{ij}$ is assigned the value $\nu_{ij}^{(L-\ell)}$. Let us
denote this assignment to the variables $\tau_{ij}$ as $\tau(j_0, L)$.^5 Note that
the assignment $\tau(j_0, L)$ does not satisfy the constraints of
the dual linear program, i.e. by itself it is not dual feasible. In
particular, any edge at distance more than $2L$ from the root
is assigned the value $0$ and hence at any variable node $i$ at
distance more than $2L$ from the root we do not satisfy the
constraint

$$\sum_{j \in \partial(i)} \tau_{ij} = \lambda_i$$

unless $\lambda_i$ happens to be $0$. However, we have the following
lemma.

^4 Edges incident to the root are said to be at distance one. If the distance
of the edge $(i,j)$ to the root $j_0$ is larger than $2L$ then $\tau_{ij} = 0$.

^5 The $j_0$ indicates that the assignment is based on the CT rooted at node
$j_0$.

Lemma 1: For each $j_0 \in V_c$ let an assignment $\tau(j_0, L)$ be
given based on $L$ iterations of the MSA. The sum

$$\tau(L) \triangleq \sum_{j_0 \in V_c} \tau(j_0, L)$$

is a multiple of a dual feasible point. More precisely, for the
number $T(L) = \sum_{\ell=1}^{L} J[(K-1)(J-1)]^{\ell-1}$ the vector

$$\frac{1}{T(L)}\, \tau(L)$$

is a dual feasible point.

Proof: Each variable node $i \in V_v$ is part of
$\sum_{\ell=1}^{L} J[(K-1)(J-1)]^{\ell-1}$ CTs for different root nodes $j_0$ and so one
can verify that we must have

$$\sum_{j \in \partial(i)} \tau_{ij}(L) = \sum_{j \in \partial(i)} \sum_{j_0 \in V_c} \tau_{ij}(j_0, L) = \sum_{\ell=1}^{L} J[(K-1)(J-1)]^{\ell-1} \lambda_i.$$

Using the abbreviation $T(L) \triangleq \sum_{\ell=1}^{L} J[(K-1)(J-1)]^{\ell-1}$ we see that

$$\frac{1}{T(L)}\, \tau(L)$$

is a dual feasible point. □

The above lemma gives a structured way to derive dual
feasible points for LP decoding from the messages passed
during the operation of the MSA. However, these points are
not very good since the overall assignment $\tau(L)$ is again
dominated by the leaves of the CT with all the pertaining
problems. The problem becomes obvious when we write out
the assignment $\tau(L)$ as a function of the MSA messages
directly. If we perform $L$ steps of iterative decoding, for any
edge $(i,j) \in \mathcal{E}$ we can write

$$\tau_{ij}(L) = \mu_{ij}^{(L)} + (J-1)\bigl(\nu_{ij}^{(L)} + (K-1)\mu_{ij}^{(L-1)}\bigr) + (J-1)^2(K-1)\bigl(\nu_{ij}^{(L-1)} + (K-1)\mu_{ij}^{(L-2)}\bigr) + \cdots.$$

Written in form of a telescoping sum we get

$$\tau_{ij}(L) = \mu_{ij}^{(L)} + (J-1)\Bigl(\nu_{ij}^{(L)} + (K-1)\Bigl(\mu_{ij}^{(L-1)} + (J-1)\Bigl(\nu_{ij}^{(L-1)} + (K-1)\bigl(\mu_{ij}^{(L-2)} + \cdots\bigr)\Bigr)\Bigr)\Bigr).$$

While the above sums show that the dual feasible point can
be easily computed alongside the MSA recursions, they also
show the problem that messages $\mu_{ij}^{(\ell)}$ and $\nu_{ij}^{(\ell)}$ are weighted
exponentially more for small values of $\ell$.

We will have to attenuate the influence of the leaves in
the CTs in order to make interesting statements. To this end,
let $\alpha$ be a vector with positive entries of length $L$ and let
a generalized assignment $\tau(j_0, L, \alpha)$ to dual variables be
derived from $\tau(j_0, L)$ by multiplying the message on each
edge at distance $2\ell + 1$ or $2\ell + 2$ by $\alpha_\ell$.^6 In other words,
values assigned to edges at distance three or four from the
root node are multiplied with $\alpha_1$, values at distance five and
six are multiplied with $\alpha_2$, etc. Again we can form the multiple
of a dual feasible point as is shown in the next lemma.

Lemma 2: For each $j_0 \in V_c$ let an assignment $\tau(j_0, L)$ be
given based on $L$ iterations of the MSA. The sum

$$\tau(L, \alpha) = \sum_{j_0 \in V_c} \tau(j_0, L, \alpha)$$

is a multiple of a dual feasible point.

Proof: Each variable node $i \in V_v$ is part of
$\sum_{\ell=1}^{L} J[(K-1)(J-1)]^{\ell-1}$ CTs for different root nodes $j_0$. Because
all edges incident to a variable node are attenuated in the
same way, one can verify that we must have

$$\sum_{j \in \partial(i)} \tau_{ij}(L, \alpha) = \sum_{j \in \partial(i)} \sum_{j_0 \in V_c} \tau_{ij}(j_0, L, \alpha) = \sum_{\ell=1}^{L} \alpha_{\ell-1} J[(K-1)(J-1)]^{\ell-1} \lambda_i.$$

Using the abbreviation $T(L) = \sum_{\ell=1}^{L} \alpha_{\ell-1} J[(K-1)(J-1)]^{\ell-1}$ we see that

$$\frac{1}{T(L)}\, \tau(L, \alpha)$$

is a dual feasible point. □
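The growth of the weights in Lemmas 1 and 2 is easy to tabulate. The Python sketch below computes the normalization $T(L)$ and the multiplicities with which the messages at edge distances $2\ell + 1$ and $2\ell + 2$ enter $\tau_{ij}(L, \alpha)$. The function names and the geometric attenuation $\alpha_\ell = (K-1)^{-\ell}$ used in the example are assumptions of this sketch; exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction

def tree_count(J, K, L, alpha=None):
    """T(L) = sum_{l=1}^{L} alpha_{l-1} * J * [(K-1)(J-1)]^(l-1); alpha defaults to ones."""
    if alpha is None:
        alpha = [1] * L
    return sum(alpha[l - 1] * J * ((K - 1) * (J - 1)) ** (l - 1)
               for l in range(1, L + 1))

def message_weights(J, K, L, alpha=None):
    """Weights of the messages at edge distance 2l+1 and 2l+2, for l = 0..L-1.

    An edge (i, j) lies at distance 2l+1 from (J-1)^l (K-1)^l roots and at
    distance 2l+2 from (J-1)^(l+1) (K-1)^l roots; both are scaled by alpha_l.
    """
    if alpha is None:
        alpha = [1] * L
    return [(alpha[l] * ((J - 1) * (K - 1)) ** l,
             alpha[l] * (J - 1) ** (l + 1) * (K - 1) ** l)
            for l in range(L)]

J, K, L = 3, 6, 3
alpha = [Fraction(1, (K - 1) ** l) for l in range(L)]    # assumed attenuation
print(tree_count(J, K, L), message_weights(J, K, L))
print(tree_count(J, K, L, alpha), message_weights(J, K, L, alpha))
```

Without attenuation the weights grow like $[(J-1)(K-1)]^\ell$ towards the leaves; with the assumed attenuation only a factor $(J-1)^\ell$ remains.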
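Optimizing the vector $\alpha$ gives us some freedom and we
want to choose the vector $\alpha$ appropriately. First we have to
learn more about the dual feasible point that we construct
in this way. While we kept the feasibility of an assignment
$\tau(j_0, L, \alpha)$ by identically scaling the values $\tau_{ij}$ that are adjacent
to a variable node in a CT, we scale values $\tau_{ij}$ that are adjacent
to check nodes differently. Given a vector $\alpha$, the dual feasible
point may be easily computed together with the messages of
the MSA. To this end define a vector $\beta$ with components
$\beta_\ell = \alpha_\ell / \alpha_{\ell-1}$. Writing again the dual variable $\tau_{ij}(L, \alpha)$ as
a function of $\mu_{ij}^{(\ell)}$ and $\nu_{ij}^{(\ell)}$ we get

$$\tau_{ij}(L, \alpha) = \mu_{ij}^{(L)} + (J-1)\bigl(\nu_{ij}^{(L)} + (K-1)\alpha_1 \mu_{ij}^{(L-1)}\bigr) + (J-1)^2(K-1)\bigl(\alpha_1 \nu_{ij}^{(L-1)} + (K-1)\alpha_2 \mu_{ij}^{(L-2)}\bigr) + \cdots.$$

Written in form of a telescoping sum we obtain

$$\tau_{ij}(L, \alpha) = \mu_{ij}^{(L)} + (J-1)\Bigl(\nu_{ij}^{(L)} + (K-1)\beta_1\Bigl(\mu_{ij}^{(L-1)} + (J-1)\Bigl(\nu_{ij}^{(L-1)} + (K-1)\beta_2\bigl(\mu_{ij}^{(L-2)} + \cdots\bigr)\Bigr)\Bigr)\Bigr).$$

^6 An edge that is incident to a node $j$ is said to be at distance one from $j$;
$\alpha_0$ is set to one.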

A particularly interesting choice for $\beta_\ell$ is $\beta_\ell = \frac{1}{K-1}$. The
main reason for this choice is given in the following lemma.

Lemma 3: Let $K > 2$ and fix some $j \in V_c$. Assume the
MSA yields messages where $\mu_{ij}^{(\ell)}$ is positive for all $i \in \partial(j)$
for some $\ell$. The inner product

$$\sum_{i \in \partial(j)} \bigl(\mu_{ij}^{(\ell)} + \nu_{ij}^{(\ell+1)}\bigr) b_i$$

is non-negative for all $b \in \mathcal{B}^{(K)}$, in particular it is positive
for all $b \in \mathcal{B}^{(K)} \setminus \{0\}$.^7

Proof: Recall that $\nu_{ij}^{(\ell+1)}$ is negative for all $(i,j) \in \mathcal{E}$ (this is in
line with the modification of the MSA). One can easily verify
the following fact about the vector containing $\mu_{ij}^{(\ell)} + \nu_{ij}^{(\ell+1)}$ for
all $i \in \partial(j)$: there is only one negative entry and the absolute
value of this entry matches the absolute value of the smallest
positive entry. The statement follows. □
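The structural fact used in the proof can be checked numerically. The Python sketch below draws positive variable-to-check messages for a single check node, applies the modified check-node update, and verifies that the vector $\mu + \nu$ has non-negative inner product with every even-weight word. The helper names, the check degree $K = 6$ and the uniform random messages are assumptions of this example, and the check covers non-negativity only.

```python
from itertools import product
import random

def check_to_variable(mu):
    """Modified check update for one check: nu_i = -(sign product) * (min magnitude) over i' != i."""
    out = []
    for i in range(len(mu)):
        others = mu[:i] + mu[i + 1:]
        sign = -1 if sum(m < 0 for m in others) % 2 else 1
        out.append(-sign * min(abs(m) for m in others))
    return out

def min_even_correlation(v):
    """Minimum of <v, b> over all even-weight binary words b (b = 0 included)."""
    return min(sum(b_i * v_i for b_i, v_i in zip(b, v))
               for b in product((0, 1), repeat=len(v))
               if sum(b) % 2 == 0)

random.seed(0)
for _ in range(1000):
    mu = [random.uniform(0.1, 10.0) for _ in range(6)]    # all messages positive
    nu = check_to_variable(mu)
    v = [m + n for m, n in zip(mu, nu)]
    assert sum(entry < 0 for entry in v) <= 1             # at most one negative entry
    assert min_even_correlation(v) >= -1e-12              # non-negative correlation

example = [m + n for m, n in zip([1.0, 2.0, 5.0], check_to_variable([1.0, 2.0, 5.0]))]
print(example)
```

For the messages $(1, 2, 5)$ the combined vector is $(-1, 1, 4)$: one negative entry whose magnitude equals the smallest positive entry, as stated in the proof.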
With the choice of $\alpha_\ell = (K-1)^{-\ell}$, which results in
$\beta_\ell = \frac{1}{K-1}$, we get the following expression for the dual feasible
point (up to the normalization by $T(L)$)

$$\tau_{ij}(L, \alpha) = \sum_{\ell=0}^{L-1} (J-1)^{\ell} \bigl(\mu_{ij}^{(L-\ell)} + (J-1)\nu_{ij}^{(L-\ell)}\bigr)$$

or

$$\tau_{ij}(L, \alpha) = \mu_{ij}^{(L)} + \sum_{\ell=1}^{L-1} (J-1)^{L-\ell} \bigl(\mu_{ij}^{(\ell)} + \nu_{ij}^{(\ell+1)}\bigr) + (J-1)^{L} \nu_{ij}^{(1)}.$$

We are still in a situation where $\mu_{ij}^{(1)}$ is weighted by a factor
that grows exponentially fast in $L$. However, we note that, once
the MSA has converged, $\mu_{ij}^{(\ell)}$ also grows exponentially fast in
$\ell$ and this offsets, to some extent, the exponential weighting
of $\mu_{ij}^{(1)}$. In order to exploit this fact more systematically we
initialize the MSA's check-to-variable messages $\nu_{ij}^{(1)}$, $(i,j) \in \mathcal{E}$,
with $-U$, where $U$ is a large enough positive number.^8 With
this initialization we can guarantee (for $K > 2$) for all $(i,j) \in \mathcal{E}$
that the value of $\mu_{ij}^{(\ell)}$ is strictly positive. Thus we can apply
Lemma 3. It remains to offset the choice $\nu_{ij}^{(1)}$ with $\mu_{ij}^{(L)}$.

^7 We assume that the indices of $b$ are given by $\partial(j)$.

To this end we consider a CT of depth $L$ rooted at check
node $j_0$. Consider the event $A_{j_0}$ that the all-zeros word on this
CT is more likely than any word that corresponds to a local
nonzero word assigned to the root node.^9

Lemma 4: Let $K > 2$ and assume the event $A_{j_0}$ is true.
Moreover, assume that we initialize the MSA with check-to-variable
messages $\nu_{ij}^{(1)} = -U$, $(i,j) \in \mathcal{E}$, for a large enough
number $U$. The inner product

$$\sum_{i \in \partial(j_0)} \bigl(\mu_{i j_0}^{(L)} + (J-1)^{L} \nu_{i j_0}^{(1)}\bigr) b_i$$

is non-negative for all $b \in \mathcal{B}^{(K)}$, in particular it is
non-negative for all $b \in \mathcal{B}^{(K)} \setminus \{0\}$.

Proof: We exploit the fact that summaries sent by the MSA
can be identified with cost differences of log-likelihood ratios.
Consider a message $\mu_{i j_0}^{(L)}$ on edge $(i, j_0)$. This message may
be written as $\mu_{i j_0}^{(L)} = p_i - (J-1)^{L} \nu_{i j_0}^{(1)}$ for some $p_i$.
Since the MSA propagates cost summaries along edges, we
can interpret $p_i$ as the summary of the cost due to the
$\lambda_i$ inside the subtree that emerges along the edge $(i, j_0)$.
Similarly, $(J-1)^{L} U$ is the cost contributed by the leaf
nodes of this subtree. Here we use the fact that the minimal
codeword which accounts for a one-assignment in edge
$(i, j_0)$ contains exactly $(J-1)^{L}$ leaf nodes with a
one-assignment. But then the vector
$(\mu_{1, j_0}^{(L)}, \mu_{2, j_0}^{(L)}, \ldots, \mu_{|\partial(j_0)|, j_0}^{(L)}) + (J-1)^{L} (\nu_{1, j_0}^{(1)}, \nu_{2, j_0}^{(1)}, \ldots, \nu_{|\partial(j_0)|, j_0}^{(1)})$
equals the vector $p = (p_1, p_2, \ldots, p_{|\partial(j_0)|})$. The event $A_{j_0}$ is true
only if the inner product $\langle p, b \rangle$ is positive for all
$b \in \mathcal{B}^{(K)} \setminus \{0\}$. Hence event $A_{j_0}$ implies the claim of the lemma. □
Let $\tau^*$ be the averaged assignment to the dual variables
obtained from the MSA messages with $\nu_{ij}^{(1)}$ set to $-U$.
Lemmas 3 and 4 imply that the sum

$$\sum_{i \in \partial(j_0)} \tau^*_{i j_0} b_i$$

has a non-negative value for any $b \in \mathcal{B}^{(K)}$, and, in particular,
the value equals zero for $b = 0$. It follows that $\theta_{j_0}$ in the dual
LP can be chosen as zero.

For each check node $j$ for which event $A_j$ is true we can be
sure that the correlation of any codeword in $\mathcal{B}^{(K)}$ with $\tau^*_j$ is
non-negative. If we can be sure that the event $A_j$ is true for all
check nodes we would, thus, have exhibited a dual witness for
the optimality of the all-zeros codeword. We have to estimate
the probability of the event $A_j$ and set it in relation to the
number of checks in the graph $T(H)$. In order to estimate the
latter we employ a result by Gallager [12] that guarantees the
existence of $(J, K)$-regular graphs in which we can conduct $L$
steps of MSA decoding without closing any cycles provided
that $L$ satisfies

$$L < \frac{\log N}{2 \log((J-1)(K-1))} - \kappa, \qquad (1)$$

where the term $\kappa$ in this expression is independent of $N$.

^8 We may choose $U$ as any number greater than $|\min_i(\lambda_i)/(J-2)|$.

^9 Event $A_{j_0}$ is defined on the CT without the change in initialization.

Finally, we can estimate the probability of the event $A_j$ from
the known weight distribution of the code on the CT provided
the underlying graph has girth at least $4L$. The minimal
codewords have weight $2(1 + (J-1) + (J-1)^2 + \cdots + (J-1)^{L-1})$
and there are a total of

$$\frac{K}{2} (K-1)^{2(1 + (J-1) + (J-1)^2 + \cdots + (J-1)^{L-1})}$$

minimal-tree codewords. Based on a union bound we thus get
an expression

$$P(\bar{A}_j) \le \frac{K}{2} \bigl((K-1)\gamma\bigr)^{2(1 + (J-1) + (J-1)^2 + \cdots + (J-1)^{L-1})}, \qquad (2)$$

where $\bar{A}_j$ denotes the complement of $A_j$. This means that
$P(\bar{A}_j)$ decreases doubly exponentially in $L$
if the Bhattacharyya parameter $\gamma$ satisfies $\gamma < \frac{1}{K-1}$.
Thus we have proved the following theorem:
Theorem 5: Let a sequence of $(J, K)$-regular LDPC codes
be given that satisfies equation (1). Under LP decoding this
sequence achieves an arbitrarily small probability of error
on any memoryless channel for which the Bhattacharyya
parameter $\gamma$ satisfies $\gamma < \frac{1}{K-1}$. For such a channel the word
error probability $P_w$ decreases as

$$P_w \le \eta_1 2^{-\eta_2 N^{\frac{\log(J-1)}{2 \log((J-1)(K-1))}}}$$

for some positive parameters $\eta_1$ and $\eta_2$.
Proof: Most of the proof is contained in the arguments leading
up to the theorem. In order to see the explicit form of the word
error rate we employ a union bound for the $N \frac{J}{K}$ check nodes,
combining (1) and (2). We find that the word error rate is
bounded by

$$P_w \le N \frac{J}{K} \cdot \frac{K}{2} \bigl((K-1)\gamma\bigr)^{2 (J-1)^{\frac{\log N}{2 \log((J-1)(K-1))} - \kappa}},$$

where $\kappa$ does not depend on $N$. The statement of the theorem
is obtained by simplifying this expression. □
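The behavior of the union bound (2) can be explored numerically. The Python sketch below evaluates $\frac{K}{2}((K-1)\gamma)^{w}$ with $w = 2(1 + (J-1) + \cdots + (J-1)^{L-1})$, which is the form of (2) as read here, and computes the Bhattacharyya parameter of a binary symmetric channel, $\gamma = 2\sqrt{p(1-p)}$. The function names, the $(3, 6)$-regular parameters and the channel choice are assumptions of this example.

```python
from math import sqrt

def bhattacharyya_bsc(p):
    """Bhattacharyya parameter of a BSC with crossover probability p."""
    return 2.0 * sqrt(p * (1.0 - p))

def min_tree_weight(J, L):
    """Weight 2 * (1 + (J-1) + ... + (J-1)^(L-1)) of a minimal codeword on the CT."""
    return 2 * sum((J - 1) ** l for l in range(L))

def union_bound(J, K, gamma, L):
    """(K/2) * ((K-1) * gamma)^w, the per-check bound on the failure of event A_j."""
    w = min_tree_weight(J, L)
    return (K / 2.0) * ((K - 1) * gamma) ** w

J, K = 3, 6
gamma = 0.1                                  # below the threshold 1/(K-1) = 0.2
for L in range(1, 5):
    print(L, min_tree_weight(J, L), union_bound(J, K, gamma, L))
print(bhattacharyya_bsc(0.01) < 1.0 / (K - 1))
```

Since the exponent $w$ itself grows exponentially in $L$, the bound decays doubly exponentially whenever $(K-1)\gamma < 1$; for $(K-1)\gamma \ge 1$ it is vacuous.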



We conclude this paper with an intriguing observation
concerning the AWGN channel. In [10] it is proved that no
$(J, K)$-regular LDPC code can achieve an error probability behavior
better than $P_w \ge \eta_3 2^{-\eta_4 N^{\frac{2 \log(J-1)}{\log((J-1)(K-1))}}}$ for constants $\eta_3$
and $\eta_4$ that are independent of $N$. The result of the theorem
thus shows that there exist sequences of LDPC codes whose
error probability behavior under LP decoding is boxed in as a
function of $N$ between:

$$\eta_3 2^{-\eta_4 N^{\frac{2 \log(J-1)}{\log((J-1)(K-1))}}} \le P_w \le \eta_1 2^{-\eta_2 N^{\frac{\log(J-1)}{2 \log((J-1)(K-1))}}}.$$